-
-
Notifications
You must be signed in to change notification settings - Fork 583
HV-1831 Optimize cascading validation for large lists #1331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
HV-1831 Optimize cascading validation for large lists #1331
Conversation
|
39b70af
to
33dc3e4
Compare
Thanks for your pull request! This pull request appears to follow the contribution rules. › This message was automatically generated. |
33dc3e4
to
adb2082
Compare
adb2082
to
dcb3188
Compare
d2e9beb
to
e74474d
Compare
HV-1831 Enhance ExecutableMetaData with tracking information HV-1831 Create ProcessedBeansTrackingVoter contract This contract allows to override the default bean process tracking behavior without exposing our internal structures. It needs a bit more love on the config side so that we can define it via XML too and some documentation. HV-1831 New zero cost approach to processed bean tracking strategy I removed it from the traditional VF for now as I would like us to focus on the case where it is useful first. We will reintroduce it later once we have validated the approach where it is the most useful. I'm a bit unclear right now if we should use the same contract for traditional and predefined scope VF as we are dealing with different things and they won't be evaluated at the same moment. I'm thinking that maybe this needs to be a different contract. HV-1831 : Wrap a `BeanMetaData` in a `NonTrackedBeanMetaDataImpl` if tracking is not required HV-1831 Add some guidance about next step HV-1831 Specific benchmark infrastructure for predefined scope HV-1831 : Update Cascade tests to use PredefinedScopeHibernateValidator with -p=predefined=true HV-1831 : Experiment detecting cycles in bean classes Add test for Map HV-1831 : Experiment detecting cycles in bean classes Add support for containers; add tests for List w/ and w/o duplicated values HV-1831 : Experiment detecting cycles in bean classes HV-1831 Copy nodes when changing the nature of the leaf HV-1831 Add the same bean to List twice HV-1831 Clean up another experiment that shouldn't have been committed HV-1831 Add a couple of examples illustrating various cases HV-1831 Unfinished experiments Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
since the map won't be able to hold nulls Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
Signed-off-by: marko-bekhta <marko.prykladna@gmail.com>
a5f215e
to
075d180
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added an implementation for detecting return value/parameters tracking detection and tried to go through more cases for bean tracking detection and adjusted the steps to account for more scenarios I could think of. Instead of looking at the validated type I tried to leverage the cascading metadata instead... my thinking was that we may have something like:
Map<KeyThatHasCycles, @Valid ValueNoCycles> prop;
and if we just inspect the map we'd think that we need tracking, but since there is no @Valid
for the key we actually don't. + this seems to simplify the case when the @Valid
is nested much deeper in the type arguments.
And some numbers:
with the tracking changes:
Benchmark Mode Cnt Score Error Units
CascadedWithLotsOfItemsValidation.testCascadedValidationWithLotsOfItems thrpt 20 16303.592 ± 205.026 ops/s
CascadedWithLotsOfItemsValidation.testCascadedValidationWithLotsOfItems:async thrpt NaN ---
PredefinedScopeCascadedWithLotsOfItemsValidation.testPredefinedScopeCascadedValidationWithLotsOfItems thrpt 20 24974.083 ± 276.476 ops/s
PredefinedScopeCascadedWithLotsOfItemsValidation.testPredefinedScopeCascadedValidationWithLotsOfItems:async thrpt NaN ---
Benchmark Mode Cnt Score Error Units
CascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems thrpt 20 10825.431 ± 154.286 ops/s
CascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems:async thrpt NaN ---
PredefinedScopeCascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems thrpt 20 10500.602 ± 139.836 ops/s
PredefinedScopeCascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems:async thrpt NaN ---
^ this seems reasonable, as when there are no cycles and we detect that tracking can be skipped in a predefined scope case things look much better, while with cycles in place the results are similar between the predefine scope and regular validator.
btw I also tried to do the tracking-without-maps:
tracking+no maps:
Benchmark Mode Cnt Score Error Units
CascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems thrpt 20 11101.882 ± 113.787 ops/s
CascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems:async thrpt NaN ---
PredefinedScopeCascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems thrpt 20 11104.529 ± 109.809 ops/s
PredefinedScopeCascadedWithLotsOfItemsAndCyclesValidation.testCascadedValidationWithLotsOfItems:async thrpt NaN ---
and the results are only slightly better than when the maps are used ... will take a closer look at it to see if it can be improved
// TODO: signature is just name and parameters so that can clash between different beans. | ||
// with that.. do we even need to track it per signature or since | ||
// we already built the `trackingEnabledForBeans` we can just "inspect" the cascadable as we go | ||
// and check against this `trackingEnabledForBeans` to see if tracking is required ? | ||
private final Map<Signature, Boolean> trackingEnabledForReturnValues; | ||
private final Map<Signature, Boolean> trackingEnabledForParameters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need to prebuild these maps, but I wanted to check with you first before dropping them...
As per that TODO, it seems that we would only check an executable once when building the metadata and I think I got an impl that would rely only on the trackingEnabledForBeans
...
and since the signatures could clash ... I guess we can drop these maps, right?
this.subtypesMap = CollectionHelper.newHashMap( rawBeanMetaDataMap.size() ); | ||
for ( Class<?> beanClass : rawBeanMetaDataMap.keySet() ) { | ||
for ( Class<?> otherBeanClass : rawBeanMetaDataMap.keySet() ) { | ||
if ( beanClass.isAssignableFrom( otherBeanClass ) ) { | ||
subtypesMap.computeIfAbsent( beanClass, k -> new HashSet<>() ) | ||
.add( otherBeanClass ); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This subtype map is here to help identify subtypes that can potentially, at runtime, introduce a cycle, and if that's so, we would mark that path as requiring the bean tracking.
// We also need to account for the case when the subtype is used at runtime that may change the cycles: | ||
// 4) A -> B -> C -> D | ||
// And C1 extends C where C1 -> A | ||
// Hence, at runtime we "may" get: | ||
// A -> B -> C1 -> D | ||
// ^ | | ||
// | | | ||
// ----------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this example is for that "sub-type" map...
// directCascadedBeanClasses.add( (Class<?>) cascadable.getCascadableType() ); | ||
// | ||
// TODO: or be much more cautious and just assume that it can be "anything": | ||
directCascadedBeanClasses.add( Object.class ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts here are that if we are unsure if the tracking could be required or not we should stay on the safe side and assume it would be...
these potentially-container-cascading cases are trouble anyway so I'd hope users would stay away from them ...
btw, this also reminds me of the other "problem", once we drop the support for @Valid List<MyBean>
, things with these potentially cascading cases won't really work. I've started adding some thoughts on that here: jakartaee/validation#266
Set<Class<?>> directCascadedBeanClasses = new HashSet<>(); | ||
for ( Cascadable cascadable : returnValueMetaData.getCascadables() ) { | ||
processSingleCascadable( cascadable, directCascadedBeanClasses ); | ||
} | ||
for ( Class<?> directCascadedBeanClass : directCascadedBeanClasses ) { | ||
if ( trackingEnabledForBeans.get( directCascadedBeanClass ) ) { | ||
return true; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea here was to reuse the logic from bean tracking detection ^ and go through cascadables, collect what "top level" classes we can cascade into and then use the trackingEnabledForBeans
map to see if tracking is required ...
ResourceBundle bundle = doGetResourceBundle( localeToPreload ); | ||
if ( bundle == null ) { | ||
LOG.resourceBundleNotPreLoaded( localeToPreload ); | ||
continue; | ||
} | ||
tmpPreloadedResourceBundles.put( localeToPreload, bundle ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
had to do this, as with the change to Map.copyOf
null
s are not allowed anymore and the startup would fail ... (noticed it when running the benchmarks)
I had a look at the numbers but I'm not sure to what I should compare them to? Could you clarify? |
I think I understand them: could you check if in the case with a cycle, we are not consistently slower? I would have expected things to be mostly transparent? |
Also both versions tested include your previous optimization, right? |
yes, all the runs are on top of main, so the previous improvements are already included.
I haven't initially compared to results from main, just predefined/default, as my thought was that the change is really about disabling the tracking ... here are the result from running the benchmark on main:
so with that compared to the case when we disable tracking:
default case matches the results from main (no tracking improvements), and the predefined is much better as we disable tracking entirely there. let me also run the one with cycles on main too... |
Here are the results for the case with cycles: main:
this patch:
It's slower than |
https://hibernate.atlassian.net/browse/HV-1831
this is only a version of #1157 for a main branch with javax->jakarta moved so far.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on licensing, please check here.